System properties

Packages used

R version

## $platform
## [1] "x86_64-pc-linux-gnu"
## 
## $arch
## [1] "x86_64"
## 
## $os
## [1] "linux-gnu"
## 
## $system
## [1] "x86_64, linux-gnu"
## 
## $status
## [1] ""
## 
## $major
## [1] "3"
## 
## $minor
## [1] "6.2"
## 
## $year
## [1] "2019"
## 
## $month
## [1] "12"
## 
## $day
## [1] "12"
## 
## $`svn rev`
## [1] "77560"
## 
## $language
## [1] "R"
## 
## $version.string
## [1] "R version 3.6.2 (2019-12-12)"
## 
## $nickname
## [1] "Dark and Stormy Night"

Variable names and labels

names labels
beg_hw_min Start of the homework assignment in minutes after the scheduled start of the lesson
dau_hw_min Duration of the homework assignment in minutes
stunde Length of the lesson in minutes
Schulart School type [Grundschule/ Gymnasium]
Klassenstufe Grade level [1 to 12]
lh_ank Teacher announces the homework assignment (verbally) [yes/no]
lh_auf Teacher demands attention (verbally/nonverbally) [yes/no]
lh_nen Teacher states the homework [yes/no]
lh_sch Teacher writes the homework on the blackboard [yes/no]
lh_erl Teacher explains the homework / has it explained [yes/no]
lh_wfr Teacher invites questions [yes/no]
lh_bfr Teacher answers questions: how many are answered? [how many?]
lh_wno Teacher asks students to write the homework down [yes/no]
lh_ano Teacher makes sure the homework is written down [yes/no]
sh_auf Students are attentive [applies, rather applies, partly applies, rather does not apply, does not apply]
sh_mel Students raise their hands to ask: how many hand raisings? [how many?]
sh_fra Students ask with or without raising their hand: how many questions are actually asked? [how many?]
sh_not Students write the homework down: roughly what percentage of students (estimated)? [how many?]

Data preparation

We are currently unable to share the data. For specific requests, please e-mail the authors (click on a name at the beginning of the document).

Grundschule

gs <- gs %>%
  dplyr::filter(`Laufende\nNummer` != "Beispiel") %>% # drop the example codings
  mutate(id = as.numeric(`Laufende\nNummer`))         # running number as id

## Correction of data based on the observed discrepancies
 # Miscalculations of `Min Beginn seit Std-Anfang` (id == 189, 248, 329) are skipped,
 # since the self-computed variable `beg_hw_min` is used instead
 # because of the differing time zone, one additional hour has to be added
gs[which(gs$id == 105), "Uhrzeit Beginn HA Vergabe"] <- "1899-12-31 11:48:00"    # typo, re-checked in the FB
gs[which(gs$id == 105), "Uhrzeit Ende Vergabe"] <- "1899-12-31 11:51:00"         # typo, re-checked in the FB
gs[which(gs$id == 120), "Uhrzeit Beginn HA Vergabe"] <- "1899-12-31 11:58:00"    # typo, re-checked in the FB
gs[which(gs$id == 168), "Uhrzeit Beginn HA Vergabe"] <- "1899-12-31 11:43:00"    # typo, re-checked in the FB
gs[which(gs$id == 250), "Uhrzeit Ende Vergabe"] <- "1899-12-31 12:33:00"         # typo, re-checked in the FB
gs[which(gs$id == 187), "Uhrzeit Beginn der Std laut Plan"] <- "1899-12-31 09:20:00" # typo, re-checked in the FB
gs[which(gs$id == 187), "Uhrzeit Ende der Std laut Plan"] <- "1899-12-31 10:50:00" # typo, re-checked in the FB
gs[which(gs$id == 187), "Uhrzeit Beginn HA Vergabe"] <- "1899-12-31 10:24:00"    # typo, re-checked in the FB
gs[which(gs$id == 187), "Uhrzeit Ende Vergabe"] <- "1899-12-31 10:40:00"         # typo, re-checked in the FB
gs[which(gs$id == 307), "Uhrzeit Ende Vergabe"] <- "1899-12-31 10:16:00"         # typo, re-checked in the FB
gs[which(gs$id == 315), "Uhrzeit Ende Vergabe"] <- "1899-12-31 11:55:00"         # typo, re-checked in the FB
gs[which(gs$id == 220), "Ankündigung"] <- 1 # typo
gs[which(gs$id == 198), "L schreibt"] <- 1 # typo
gs[which(gs$id == 330), "L will Notation"] <- NA # the value "9" does not exist; it is not clear which value would be plausible


# new variables
gs <- gs %>%
  mutate(beg_plan = hm(format(strptime(`Uhrzeit Beginn der Std laut Plan`,"%Y-%m-%d %H:%M:%S"),'%H:%M')),
         end_plan = hm(format(strptime(`Uhrzeit Ende der Std laut Plan`,"%Y-%m-%d %H:%M:%S"),'%H:%M')),
         beg_act  = hm(format(strptime(`Uhrzeit Realer Beginn der Std`,"%Y-%m-%d %H:%M:%S"),'%H:%M')),
         beg_hw   = hm(format(strptime(`Uhrzeit Beginn HA Vergabe`,"%Y-%m-%d %H:%M:%S"),'%H:%M')),
         end_hw   = hm(format(strptime(`Uhrzeit Ende Vergabe`,"%Y-%m-%d %H:%M:%S"),'%H:%M')),
         stunde   = lubridate::hour(end_plan - beg_plan)*60 + lubridate::minute(end_plan - beg_plan),     # planned length of the lesson
         beg_hw_min = lubridate::hour(beg_hw - beg_plan)*60 + lubridate::minute(beg_hw - beg_plan),       # homework start relative to the PLANNED lesson start
         dau_hw_min = lubridate::hour(end_hw - beg_hw)*60 + lubridate::minute(end_hw - beg_hw),           # duration of the homework assignment
         lh_ank   = `Ankündigung`,
         lh_auf   = `L verlangt Aufmerk`,
         lh_nen   = `L nennt`,
         lh_sch   = `L schreibt`,
         lh_erl   = `L erläutert`,
         lh_wfr   = `L will Fragen`,
         lh_bfr   = `L beantwortet`,
         lh_wno   = `L will Notation`,
         lh_ano   = `L achtet Notat`,
         sh_auf = `S aufmerksam`,
         sh_mel = `S melden`,
         sh_fra = `S fragen`,
         sh_not = `S notieren`)
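The minute arithmetic above (hours of the period difference times 60, plus its minutes) can be cross-checked with base R's difftime; a minimal standalone sketch with hypothetical example times, independent of lubridate:

```r
# hypothetical times: lesson planned for 08:00, homework assignment starts at 08:33
beg_plan <- strptime("1899-12-31 08:00:00", "%Y-%m-%d %H:%M:%S")
beg_hw   <- strptime("1899-12-31 08:33:00", "%Y-%m-%d %H:%M:%S")

# minutes between homework start and planned lesson start
beg_hw_min <- as.numeric(difftime(beg_hw, beg_plan, units = "mins"))
beg_hw_min  # 33
```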

Imputation of missing data

Scale levels

  • lh_ank: dichotomous [0,1]
  • lh_auf: dichotomous [0,1]
  • lh_nen: dichotomous [0,1]
  • lh_sch: dichotomous [0,1]
  • lh_erl: dichotomous [0,1]
  • lh_wfr: dichotomous [0,1]
  • lh_bfr: metric, absolute scale
  • lh_wno: dichotomous [0,1]
  • lh_ano: dichotomous [0,1]
  • sh_auf: ordinal [1:5]
  • sh_mel: metric, absolute scale
  • sh_fra: metric, absolute scale
  • sh_not: metric, interval scale

Missing data in the data set

Combinations of missing data

Decisions

during imputation

imputation model

for each variable

variable scale.type method
beg_hw_min metric pmm
dau_hw_min metric pmm
stunde metric pmm
Schulart nominal polyreg
Klassenstufe metric pmm
lh_ank binary logreg
lh_auf binary logreg
lh_nen binary logreg
lh_sch binary logreg
lh_erl binary logreg
lh_wfr binary logreg
lh_bfr metric pmm
lh_wno binary logreg
lh_ano binary logreg
sh_auf metric pmm
sh_mel metric pmm
sh_fra metric pmm
sh_not metric pmm
Schulart_n metric ~ifelse(Schulart == 'Grundschule', 0, 1)

Defining methods.
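In mice, the per-variable methods from the table above are supplied as a named character vector via the meth argument; a minimal sketch of how such a vector is built (only a hypothetical subset of the variables is shown, the full study code is not reproduced here):

```r
# hypothetical subset of the method vector later passed to mice::mice(meth = meth)
meth <- c(beg_hw_min = "pmm",      # metric -> predictive mean matching
          Schulart   = "polyreg",  # nominal -> polytomous regression
          lh_ank     = "logreg",   # binary -> logistic regression
          Schulart_n = "~ifelse(Schulart == 'Grundschule', 0, 1)")  # passive imputation

meth["lh_ank"]
```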

number of imputations
1000

selection of predictors

Check for multicollinearity:

Three variables are excluded as predictors: stunde, lh_bfr, Schulart_n.

variables that are a function of other variables
Schulart_n is a numeric version of Schulart; see the imputation method above.

which variables to impute
All.

number of iterations
20

check imputation

plausible values
This code would generate one plot per imputation (1,000 plots). If you are interested, it can be uncommented; we omit it here to save space.

All values seem plausible.

check convergence

Bayes Factor Design Analysis for contingency tables

Bayes factor design analysis to determine the BF threshold.

Effect size

In the absence of reference values, we assume an effect size of \(\varphi=.2\), which lies between a small (\(\varphi=.1\)) and a medium (\(\varphi=.3\)) effect according to Cohen (1988).
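For orientation, the φ coefficient of a 2×2 table is \((ad-bc)/\sqrt{r_1 r_2 c_1 c_2}\), where a–d are the cell counts and r/c the margin totals. A short base-R sketch with made-up frequencies chosen so that φ is exactly .2:

```r
# made-up 2x2 frequency table, chosen so that phi = .2
tab <- matrix(c(153, 102,
                102, 153), nrow = 2, byrow = TRUE)

# phi = (ad - bc) / sqrt(product of the margin totals)
phi <- (tab[1, 1]*tab[2, 2] - tab[1, 2]*tab[2, 1]) /
       sqrt(prod(rowSums(tab)) * prod(colSums(tab)))
phi  # 0.2
```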

Number of simulations

We run 1,000 simulations. Given how robust the results are in this range, we refrained from a larger number of simulations.

Simulation results

sim_cor_results <- data.frame()   # set up data frame for results

for(j in true_hyp) {                               # loop over both hypotheses
  if (j == "H0 is true")
  bincorr <- matrix(c(1,phi[1],phi[1],1), ncol=2)  # create correlation matrix for H0
  if (j == "H1 is true")
  bincorr <- matrix(c(1,phi[2],phi[2],1), ncol=2)  # create correlation matrix for H1
  
  for (n in 1:n_sim) {
    sim_df <- rmvbin(n = 510,
                     margprob = c(0.5, 0.5),
                     bincorr = bincorr)
    
    sim_imp_m <- matrix(table(sim_df[,1], sim_df[,2]), 2, 2)
    sim_fit <- contingencyTableBF(sim_imp_m, sampleType="indepMulti",fixedMargin = "cols")
    sim_cor_results[n+ifelse(j == "H0 is true", 0, n_sim), "BayesFactor"] <- extractBF(sim_fit)$bf
    sim_cor_results[n+ifelse(j == "H0 is true", 0, n_sim), "trueH"] <- j
    rm(sim_imp_m, sim_fit)
  }
}

# categorize if result is correct, incorrect or inconclusive
sim_cor_results <- sim_cor_results %>%
  mutate(BF3 = case_when(
                    BayesFactor >= 3 & trueH == "H0 is true" ~ "incorrect",
                    BayesFactor < 3 & BayesFactor > (1/3) & trueH == "H0 is true" ~ "inconclusive",
                    BayesFactor <= (1/3) & trueH == "H0 is true" ~ "correct",
                    BayesFactor >= 3 & trueH == "H1 is true" ~ "correct",
                    BayesFactor < 3 & BayesFactor > (1/3) & trueH == "H1 is true" ~ "inconclusive",
                    BayesFactor <= (1/3) & trueH == "H1 is true" ~ "incorrect"),
         BF5 = case_when(
                    BayesFactor >= 5 & trueH == "H0 is true" ~ "incorrect",
                    BayesFactor < 5 & BayesFactor > (1/5) & trueH == "H0 is true" ~ "inconclusive",
                    BayesFactor <= (1/5) & trueH == "H0 is true" ~ "correct",
                    BayesFactor >= 5 & trueH == "H1 is true" ~ "correct",
                    BayesFactor < 5 & BayesFactor > (1/5) & trueH == "H1 is true" ~ "inconclusive",
                    BayesFactor <= (1/5) & trueH == "H1 is true" ~ "incorrect"),
         BF10 = case_when(
                    BayesFactor >= 10 & trueH == "H0 is true" ~ "incorrect",
                    BayesFactor < 10 & BayesFactor > (1/10) & trueH == "H0 is true" ~ "inconclusive",
                    BayesFactor <= (1/10) & trueH == "H0 is true" ~ "correct",
                    BayesFactor >= 10 & trueH == "H1 is true" ~ "correct",
                    BayesFactor < 10 & BayesFactor > (1/10) & trueH == "H1 is true" ~ "inconclusive",
                    BayesFactor <= (1/10) & trueH == "H1 is true" ~ "incorrect"),
         )
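The three case_when blocks above apply the same decision rule for the thresholds 3, 5, and 10. Equivalently, the rule can be written once as a small function (a base-R sketch, not part of the original script):

```r
# classify a Bayes factor against a threshold thr, given which hypothesis is true:
# BF >= thr counts as evidence for H1, BF <= 1/thr as evidence for H0
classify_bf <- function(bf, true_h, thr) {
  decision <- ifelse(bf >= thr, "H1",
              ifelse(bf <= 1/thr, "H0", "inconclusive"))
  ifelse(decision == "inconclusive", "inconclusive",
         ifelse((decision == "H1") == (true_h == "H1 is true"),
                "correct", "incorrect"))
}

classify_bf(c(4, 0.2, 1), "H1 is true", 3)
# "correct" "incorrect" "inconclusive"
```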

# pivot into long data frame for plot
sim_cor_results_l <- pivot_longer(sim_cor_results, 
                                  cols = 3:5, 
                                  names_to = "BF Threshold",
                                  values_to = "decision") 
# order factor for plot
sim_cor_results_l$`BF Threshold` <- factor(sim_cor_results_l$`BF Threshold`, levels = c("BF3", "BF5", "BF10"))


# hrbrthemes::import_roboto_condensed()

ggplot(sim_cor_results_l, aes(trueH, fill = decision)) +
    geom_bar(position = "fill") +
    geom_text(aes(label=round(..count../n_sim*100), y= ..count../n_sim),
    position =position_stack(vjust = 0.5), stat= "count",
    color = "white", size = 5) +
    coord_flip() +
    facet_wrap(~`BF Threshold`, ncol = 1) +
    labs(title = "Results of the Bayes Factor Design Analysis",
         subtitle = "For three different Bayes Factors",
         caption = paste("In % (rounded), based on", n_sim, "simulations")) +
    xlab("True Hypothesis") +
    scale_fill_viridis_d() +
    theme_ipsum_rc()

The results show that with a BF of 3 hardly any false-positive results occur (~1%), and the power is satisfactory to very high in each case. It is therefore not necessary to use a higher BF to avoid false positives. Higher BFs would also have the disadvantage of producing more inconclusive results. For the analysis of the binary associations (contingency tables) we therefore use a threshold of \(BF=3\) and \(BF=\frac{1}{3}\), respectively.

Bayes Factor Design Analysis for t-tests

Bayes factor design analysis to determine the BF threshold.

Effect size

In the absence of reference values, we assume an effect size of \(d=.35\), which lies between a small (\(d=.2\)) and a medium (\(d=.5\)) effect according to Cohen (1988).
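To make the assumed effect concrete: Cohen's d is the mean difference divided by the pooled SD, so with a common SD of 1 the group means differ by 0.35. A deterministic base-R sketch with hypothetical group summaries:

```r
# hypothetical group summaries: means 0 and 0.35, SD 1, n = 255 per group
m <- c(0, 0.35); s <- c(1, 1); n <- c(255, 255)

pooled_sd <- sqrt(((n[1] - 1)*s[1]^2 + (n[2] - 1)*s[2]^2) / (sum(n) - 2))
d <- (m[2] - m[1]) / pooled_sd
d  # 0.35
```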

Number of simulations

We run 1,000 simulations. Given how robust the results are in this range, we refrained from a larger number of simulations.

Simulation results

sim_ttest_results <- data.frame()   # set up data frame for results

for(j in true_hyp) {                            # loop over both hypotheses

  for (n in 1:n_sim) {                             # loop over all simulations
      sim_ttest_df <- data.frame(mvrnorm(n = 510,                         # fixed n of 510
                                     mu = if(j == "H0 is true")
                                          c(0,true_d[1]) else              # create data set for H0
                                          if(j == "H1 is true")
                                          c(0,true_d[2]),             # create data set for H1
                                     Sigma = matrix(c( 1, .5,         # vcov matrix
                                                      .5,  1),
                                                      2, 2)))
      
    # pivot longer to insert it into a lm
    sim_ttest_df_l <- pivot_longer(sim_ttest_df, 1:2, names_to = "group", values_to = "dependentVar")
    
    ### LINEAR MODEL ############################################ #
    # compute the means of each group
    sim_fit_ttest <- lm(dependentVar ~ group-1, 
                        data = sim_ttest_df_l)
    ### BAIN #################################################### #    
    # generating hypotheses
    hypotheses <- "groupX1 = groupX2; groupX1 < groupX2"   #H1 and H2 respectively
    
    bf_ttest <- bain(sim_fit_ttest, 
                     hypothesis = hypotheses
                     )
    
    sim_ttest_results[n+ifelse(j == "H0 is true", 0, n_sim),"BayesFactor"] <- bf_ttest$BFmatrix[2,1]  # BF(H2,H1)
    sim_ttest_results[n+ifelse(j == "H0 is true", 0, n_sim), "trueH"] <- j
    rm(bf_ttest, sim_fit_ttest, sim_ttest_df, sim_ttest_df_l)
  }
}


# categorize if result is correct, incorrect or inconclusive
sim_ttest_results <- sim_ttest_results %>%
  mutate(BF3 = case_when(
                    BayesFactor >= 3 & trueH == "H0 is true" ~ "incorrect",
                    BayesFactor < 3 & BayesFactor > (1/3) & trueH == "H0 is true" ~ "inconclusive",
                    BayesFactor <= (1/3) & trueH == "H0 is true" ~ "correct",
                    BayesFactor >= 3 & trueH == "H1 is true" ~ "correct",
                    BayesFactor < 3 & BayesFactor > (1/3) & trueH == "H1 is true" ~ "inconclusive",
                    BayesFactor <= (1/3) & trueH == "H1 is true" ~ "incorrect"),
         BF5 = case_when(
                    BayesFactor >= 5 & trueH == "H0 is true" ~ "incorrect",
                    BayesFactor < 5 & BayesFactor > (1/5) & trueH == "H0 is true" ~ "inconclusive",
                    BayesFactor <= (1/5) & trueH == "H0 is true" ~ "correct",
                    BayesFactor >= 5 & trueH == "H1 is true" ~ "correct",
                    BayesFactor < 5 & BayesFactor > (1/5) & trueH == "H1 is true" ~ "inconclusive",
                    BayesFactor <= (1/5) & trueH == "H1 is true" ~ "incorrect"),
         BF10 = case_when(
                    BayesFactor >= 10 & trueH == "H0 is true" ~ "incorrect",
                    BayesFactor < 10 & BayesFactor > (1/10) & trueH == "H0 is true" ~ "inconclusive",
                    BayesFactor <= (1/10) & trueH == "H0 is true" ~ "correct",
                    BayesFactor >= 10 & trueH == "H1 is true" ~ "correct",
                    BayesFactor < 10 & BayesFactor > (1/10) & trueH == "H1 is true" ~ "inconclusive",
                    BayesFactor <= (1/10) & trueH == "H1 is true" ~ "incorrect"),
         )

# pivot into long data frame for plot
sim_ttest_results_l <- pivot_longer(sim_ttest_results, 
                                    cols = 3:5, 
                                    names_to = "BF Threshold",
                                    values_to = "decision") 
# order factor for plot
sim_ttest_results_l$`BF Threshold` <- factor(sim_ttest_results_l$`BF Threshold`, levels = c("BF3", "BF5", "BF10"))


# hrbrthemes::import_roboto_condensed()

ggplot(sim_ttest_results_l, aes(trueH, fill = decision)) +
    geom_bar(position = "fill") +
    geom_text(aes(label=round(..count../n_sim*100), y= ..count../n_sim),
    position =position_stack(vjust = 0.5), stat= "count",
    color = "white", size = 5) +
    coord_flip() +
    facet_wrap(~`BF Threshold`, ncol = 1) +
    labs(title = "Results of the Bayes Factor Design Analysis",
         subtitle = "For three different Bayes Factors",
         caption = paste("In % (rounded), based on", n_sim, "simulations")) +
    xlab("True Hypothesis") +
    scale_fill_viridis_d() +
    theme_ipsum_rc()

Here, too, with \(N=510\) and an assumed Cohen's \(d=.35\), there are (almost) no false-positive results. We therefore also use a threshold of \(BF=3\) and \(BF=\frac{1}{3}\), respectively.

Timing of homework assignment

Descriptive data: timing & duration

Descriptive data: single lessons

  • Grundschule:

    Table continues below
      vars n mean sd median trimmed mad min
    beg_hw_min 1 270 33.330 13.251 39 34.745 5.930 1
    dau_hw_min 2 267 4.524 3.150 4 4.074 1.483 0
      max range skew kurtosis se
    beg_hw_min 83 82 -0.728 0.366 0.806
    dau_hw_min 25 25 2.133 7.826 0.193
  • Gymnasium:

    Table continues below
      vars n mean sd median trimmed mad min
    beg_hw_min 1 143 38.839 8.347 42 40.504 2.965 1
    dau_hw_min 2 142 2.937 2.216 2 2.667 1.483 0
      max range skew kurtosis se
    beg_hw_min 49 48 -2.628 7.853 0.698
    dau_hw_min 15 15 2.063 7.003 0.186
## [1] "Modus der Hausaufgabenvergabe in 45min Stunden Grundschule = 41.1"
## [1] "Modus der Hausaufgabenvergabe in 45min Stunden Gymnasium = 42.9"


Descriptive data: double lessons

  • Grundschule:

    Table continues below
      vars n mean sd median trimmed mad min
    beg_hw_min 1 26 69.346 24.095 80.0 73.545 7.413 5
    dau_hw_min 2 26 8.269 7.411 5.5 7.091 3.706 1
      max range skew kurtosis se
    beg_hw_min 85 80 -1.732 1.692 4.725
    dau_hw_min 35 34 2.056 4.239 1.453
  • Gymnasium:

    Table continues below
      vars n mean sd median trimmed mad min
    beg_hw_min 1 42 79.048 15.862 85 82.206 5.930 29
    dau_hw_min 2 40 3.825 2.772 3 3.438 2.965 1
      max range skew kurtosis se
    beg_hw_min 95 66 -1.723 2.020 2.448
    dau_hw_min 13 12 1.169 1.263 0.438
## [1] "Modus der Hausaufgabenvergabe in 90min Stunden Grundschule = 81.8"
## [1] "Modus der Hausaufgabenvergabe in 90min Stunden Gymnasium = 87.2"

45-minute lessons

Predictor Schulart

Descriptive plot


Inferential statistics

#### IMPUTATION #### #
# separate imputation for the 45-minute lessons
p_data_45 <- p_data %>%
    dplyr::filter(stunde==45)

imp45 <- mice(p_data_45, 
              maxit = maxit, 
              m = m,
              meth = meth,
              pred = pred,
              seed = 666,
              printFlag = F
              )

#### Compute BFs within informed hypotheses framework (bain) ##################################### #

### for bain: compute vcov by hand ###
# create data frame to collect var and cov for each imputation
vcov_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

mean_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

# sample size per group
samp_gr <- table(mice::complete(imp45)$Schulart)

## loop over all m imputations
for(i in 1:m) {
  # fit model
  fit_lm <- lm(beg_hw_min ~ Schulart-1, data = mice::complete(imp45, i))
  
  var_lm <- summary(fit_lm)$sigma**2 # get variance of the means (VOM)
  vcov_df[i, "Grundschule"] <- var_lm/samp_gr["Grundschule"] # compute VOM per group
  vcov_df[i, "Gymnasium"] <- var_lm/samp_gr["Gymnasium"] # compute VOM per group
  
  # collect estimates of the means
  mean_df[i, "Grundschule"] <- coef(fit_lm)["SchulartGrundschule"]
  mean_df[i, "Gymnasium"] <- coef(fit_lm)["SchulartGymnasium"]
  
  rm(fit_lm, var_lm) # clean up, because we like it tidy in here
}
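The loop above stores, for every imputation, the two group means and the variance of each mean; the subsequent summarize_all(mean) steps simply average both across the m imputations before they are handed to bain. That pooling step in isolation (a standalone base-R sketch with made-up per-imputation values, not the study data):

```r
# made-up per-imputation estimates (3 imputations, 2 groups)
mean_df <- data.frame(Grundschule = c(33.1, 33.4, 33.2),
                      Gymnasium   = c(38.8, 39.0, 38.9))
vcov_df <- data.frame(Grundschule = c(0.65, 0.66, 0.64),
                      Gymnasium   = c(0.49, 0.50, 0.51))

# average the estimates and their variances over the imputations
bf_data   <- colMeans(mean_df)                                       # pooled group means
variances <- lapply(colMeans(vcov_df), function(v) matrix(v, 1, 1))  # 1x1 Sigma per group
```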


## make var matrices 
# compute the mean over var
vcov_df <- vcov_df %>%
        summarize_all(mean)

# create matrices
mat1 <- matrix(vcov_df$Grundschule, 1, 1)
mat2 <- matrix(vcov_df$Gymnasium, 1, 1)

variances <- list(mat1, mat2)

## compute mean of estimates
bf_data <- mean_df %>%
        summarize_all(mean)
bf_data <- as.numeric(bf_data)
names(bf_data) <- c("Grund", "Gym")

## BAIN ##### #
# generating hypotheses
hypotheses <- "Gym = Grund; Gym < Grund; Gym > Grund"

bf_45 <- bain(bf_data, 
              hypothesis = hypotheses,
              n = samp_gr,           # size of the groups
              Sigma = variances,     # matrices of residual variances of groups
              group_parameters = 1,  # there is 1 group specific parameter (the mean in each group)
              joint_parameters = 0   # there are no parameters that apply to each of the groups (e.g. the regression coefficient of a covariate)
              )

print(bf_45)
## Bayesian informative hypothesis testing for an object of class numeric:
## 
##    Fit_eq Com_eq Fit_in Com_in Fit   Com   BF         PMPa  PMPb 
## H1 0.000  0.017  1.000  1.000  0.000 0.017 0.001      0.000 0.000
## H2 1.000  1.000  0.000  0.500  0.000 0.500 0.000      0.000 0.000
## H3 1.000  1.000  1.000  0.500  1.000 0.500 333236.770 1.000 0.667
## Hu                                                          0.333
## 
## Hypotheses:
##   H1: Gym=Grund
##   H2: Gym<Grund
##   H3: Gym>Grund
## 
## Note: BF denotes the Bayes factor of the hypothesis at hand versus its complement.
##              H1          H2           H3
## H1 1.000000e+00    115.0622 3.452867e-04
## H2 8.690949e-03      1.0000 3.000869e-06
## H3 2.896144e+03 333236.7735 1.000000e+00

Predictor Klassenstufe

Descriptive plot

p_data_ridges <- p_data %>%
    dplyr::filter(!is.na(Klassenstufe)) %>%
    mutate(Klassenstufe = factor(Klassenstufe, levels = c(1,2,3,4,5,6,7,8,9,10,11,12)))

ridges_length45 <- p_data_ridges %>%
    dplyr::filter(stunde == 45) %>%
    group_by(Klassenstufe) %>%
    summarize_all(length) 

names(ridges_length45) <- paste(names(ridges_length45), "_n45", sep="")
names(ridges_length45)[1] <- "Klassenstufe"

ridges_length90 <- p_data_ridges %>%
    dplyr::filter(stunde == 90) %>%
    group_by(Klassenstufe) %>%
    summarize_all(length) 

names(ridges_length90) <- paste(names(ridges_length90), "_n90", sep="")
names(ridges_length90)[1] <- "Klassenstufe"

ridges_length <- p_data_ridges %>%
    group_by(Klassenstufe) %>%
    summarize_all(length) 

names(ridges_length) <- paste(names(ridges_length), "_n", sep="")
names(ridges_length)[1] <- "Klassenstufe"

p_data_ridges <- left_join(p_data_ridges, ridges_length45, by="Klassenstufe")
p_data_ridges <- left_join(p_data_ridges, ridges_length90, by="Klassenstufe")
p_data_ridges <- left_join(p_data_ridges, ridges_length, by="Klassenstufe")


ggplot(p_data_ridges%>%dplyr::filter(stunde == 45), aes(x=as.factor(Klassenstufe), y = beg_hw_min, fill = Schulart, 
                                                        colour = Schulart, group = as.factor(Klassenstufe))) +
    geom_flat_violin(adjust = 2, 
                     trim = F, 
                     alpha = .3, 
                     # scale = "count", 
                     width = 3) +
    geom_hline(yintercept = 45, linetype = "dashed", colour = "#696f71", size = 1) +
    stat_summary(fun.y = mean, geom = "line", aes(group = 1), position = position_nudge(x=.1), size = 1, color = "grey") +
    stat_summary(aes(size = beg_hw_min_n45), fun.y = mean, geom = "point",  
                 alpha = .85, position = position_nudge(x=.1)) +
    scale_size(range = c(.3,7.7)) +
    scale_colour_brewer(palette = "Set1")+
    scale_fill_brewer(palette = "Set1") +
    scale_y_continuous(expand = c(0, 0), breaks = c(0, 10,20,30,40,50), limits = c(0, 55)) +
    scale_x_discrete(expand = c(0, 0)) +
    ylab("Minuten seit Stundenbeginn bei Vergabe der HA") +
    labs(size = "Anzahl\neingegangener\nStunden") +
    xlab("Klassenstufe") +
    ggtitle("Einzelstunden") +
    theme_light() 


Inferential statistics
Given the lack of standards for pooling Bayes factors (BF), we compute a Bayes factor for each imputed data set and then report their distribution.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   439.2  1498.7  1934.4  2057.6  2419.9  6291.9
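Since no pooled BF is computed, the reporting step reduces to summarizing the vector of per-data-set Bayes factors; schematically (the values below are made up and merely illustrate the reporting, they are not the study results):

```r
# hypothetical per-imputation Bayes factors, one per completed data set
bfs <- c(439.2, 1498.7, 1934.4, 2419.9, 6291.9)

# report the distribution instead of a single pooled value
round(summary(bfs), 1)
```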

90-minute lessons

Predictor Schulart

Descriptive plot


Inferential statistics

#### IMPUTATION #### #
# separate imputation for the 90-minute lessons
p_data_90 <- p_data %>%
    dplyr::filter(stunde==90)

imp90 <- mice(p_data_90, 
              maxit = maxit, 
              m = m,
              meth = meth,
              pred = pred,
              seed = 666,
              printFlag = F
              )

#### Compute BFs within informed hypotheses framework (bain) ##################################### #

### for bain: compute vcov by hand ###
# create data frame to collect var and cov for each imputation
vcov_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

mean_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

# sample size per group
samp_gr <- table(mice::complete(imp90)$Schulart)


## loop over all m imputations
for(i in 1:m) {
  fit_lm <- lm(beg_hw_min ~ Schulart-1, data = mice::complete(imp90, i))
  
  var_lm <- summary(fit_lm)$sigma**2 # get variance of the means (VOM)
  vcov_df[i, "Grundschule"] <- var_lm/samp_gr["Grundschule"] # compute VOM per group
  vcov_df[i, "Gymnasium"] <- var_lm/samp_gr["Gymnasium"] # compute VOM per group
  
  # collect estimates of the means
  mean_df[i, "Grundschule"] <- coef(fit_lm)["SchulartGrundschule"]
  mean_df[i, "Gymnasium"] <- coef(fit_lm)["SchulartGymnasium"]
  
  rm(fit_lm, var_lm) # clean up, because we like it tidy in here
}

## make var matrices 
# compute the mean over var
vcov_df <- vcov_df %>%
        summarize_all(mean)

# create matrices
mat1 <- matrix(vcov_df$Grundschule, 1, 1)
mat2 <- matrix(vcov_df$Gymnasium, 1, 1)

variances <- list(mat1, mat2)

## compute mean of estimates
bf_data <- mean_df %>%
        summarize_all(mean)
bf_data <- as.numeric(bf_data)
names(bf_data) <- c("Grund", "Gym")


## BAIN ##### #
# generating hypotheses
hypotheses <- "Gym = Grund; Gym < Grund; Gym > Grund"

bf_90 <- bain(bf_data, 
              hypothesis = hypotheses,
              n = samp_gr,           # size of the groups
              Sigma = variances,     # matrices of residual variances of groups
              group_parameters = 1,  # there is 1 group specific parameter (the mean in each group)
              joint_parameters = 0   # there are no parameters that apply to each of the groups (e.g. the regression coefficient of a covariate)
              )

print(bf_90)
## Bayesian informative hypothesis testing for an object of class numeric:
## 
##    Fit_eq Com_eq Fit_in Com_in Fit   Com   BF     PMPa  PMPb 
## H1 0.011  0.010  1.000  1.000  0.011 0.010 1.075  0.350 0.264
## H2 1.000  1.000  0.023  0.500  0.023 0.500 0.023  0.015 0.011
## H3 1.000  1.000  0.977  0.500  0.977 0.500 43.415 0.636 0.480
## Hu                                                      0.245
## 
## Hypotheses:
##   H1: Gym=Grund
##   H2: Gym<Grund
##   H3: Gym>Grund
## 
## Note: BF denotes the Bayes factor of the hypothesis at hand versus its complement.
##            H1       H2         H3
## H1 1.00000000 23.87773 0.54998494
## H2 0.04188002  1.00000 0.02303338
## H3 1.81823161 43.41525 1.00000000

Predictor Klassenstufe

Descriptive plot


Inferential statistics
Given the lack of standards for pooling Bayes factors (BF), we compute a Bayes factor for each imputed data set and then report their distribution.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   6.937   6.937   6.937   6.937   6.937   6.937

Time required for homework assignment

45-minute lessons

Predictor Schulart

Descriptive plot


Inferential statistics

#### Compute BFs within informed hypotheses framework (bain) ##################################### #

### for bain: compute vcov by hand ###
# create data frame to collect var and cov for each imputation
vcov_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

mean_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

# sample size per group
samp_gr <- table(mice::complete(imp45)$Schulart)


## loop over all m imputations
for(i in 1:m) {
  fit_lm <- lm(dau_hw_min ~ Schulart-1, data = mice::complete(imp45, i))
  
  var_lm <- summary(fit_lm)$sigma**2 # get variance of the means (VOM)
  vcov_df[i, "Grundschule"] <- var_lm/samp_gr["Grundschule"] # compute VOM per group
  vcov_df[i, "Gymnasium"] <- var_lm/samp_gr["Gymnasium"] # compute VOM per group
  
  # collect estimates of the means
  mean_df[i, "Grundschule"] <- coef(fit_lm)["SchulartGrundschule"]
  mean_df[i, "Gymnasium"] <- coef(fit_lm)["SchulartGymnasium"]
  
  rm(fit_lm, var_lm) # clean up, because we like it tidy in here
}

## make var matrices 
# compute the mean over var
vcov_df <- vcov_df %>%
        summarize_all(mean)

# create matrices
mat1 <- matrix(vcov_df$Grundschule, 1, 1)
mat2 <- matrix(vcov_df$Gymnasium, 1, 1)

variances <- list(mat1, mat2)

## compute mean of estimates
bf_data <- mean_df %>%
        summarize_all(mean)
bf_data <- as.numeric(bf_data)
names(bf_data) <- c("Grund", "Gym")


## BAIN ##### #
# generating hypotheses
hypotheses <- "Gym = Grund; Gym < Grund; Gym > Grund"


bf_45 <- bain(bf_data, 
              hypothesis = hypotheses,
              n = samp_gr,           # size of the groups
              Sigma = variances,     # matrices of residual variances of groups
              group_parameters = 1,  # there is 1 group specific parameter (the mean in each group)
              joint_parameters = 0   # there are no parameters that apply to each of the groups (e.g. the regression coefficient of a covariate)
              )

print(bf_45)
## Bayesian informative hypothesis testing for an object of class numeric:
## 
##    Fit_eq Com_eq Fit_in Com_in Fit   Com   BF           PMPa  PMPb 
## H1 0.000  0.069  1.000  1.000  0.000 0.069 0.000        0.000 0.000
## H2 1.000  1.000  1.000  0.500  1.000 0.500 19180728.204 1.000 0.667
## H3 1.000  1.000  0.000  0.500  0.000 0.500 0.000        0.000 0.000
## Hu                                                            0.333
## 
## Hypotheses:
##   H1: Gym=Grund
##   H2: Gym<Grund
##   H3: Gym>Grund
## 
## Note: BF denotes the Bayes factor of the hypothesis at hand versus its complement.
##              H1           H2           H3
## H1 1.000000e+00 6.970502e-06 1.336994e+02
## H2 1.434617e+05 1.000000e+00 1.918075e+07
## H3 7.479464e-03 5.213562e-08 1.000000e+00

Predictor Klassenstufe

Descriptive plot


Inferential statistics
Given the lack of standards for pooling Bayes factors (BF), we compute a Bayes factor for each imputed data set and then report their distribution.

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##   153766  2977665  4352611  5419865  6829702 28537391

90-minute lessons

Predictor Schulart

Descriptive plot


Inferential statistics

#### Compute BFs within informed hypotheses framework (bain) ##################################### #

### for bain: compute vcov by hand ###
# create data frame to collect var and cov for each imputation
vcov_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

mean_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

# sample size per group
samp_gr <- table(mice::complete(imp90)$Schulart)


## loop over all m imputations
for(i in 1:m) {
  fit_lm <- lm(dau_hw_min ~ Schulart-1, data = mice::complete(imp90, i))
  
  var_lm <- summary(fit_lm)$sigma**2 # get variance of the means (VOM)
  vcov_df[i, "Grundschule"] <- var_lm/samp_gr["Grundschule"] # compute VOM per group
  vcov_df[i, "Gymnasium"] <- var_lm/samp_gr["Gymnasium"] # compute VOM per group
  
  # collect estimates of the means
  mean_df[i, "Grundschule"] <- coef(fit_lm)["SchulartGrundschule"]
  mean_df[i, "Gymnasium"] <- coef(fit_lm)["SchulartGymnasium"]
  
  rm(fit_lm, var_lm) # clean up, because we like it tidy in here
}

## make var matrices 
# compute the mean over var
vcov_df <- vcov_df %>%
        summarize_all(mean)

# create matrices
mat1 <- matrix(vcov_df$Grundschule, 1, 1)
mat2 <- matrix(vcov_df$Gymnasium, 1, 1)

variances <- list(mat1, mat2)

## compute mean of estimates
bf_data <- mean_df %>%
        summarize_all(mean)
bf_data <- as.numeric(bf_data)
names(bf_data) <- c("Grund", "Gym")

## BAIN ##### #
# generating hypotheses
hypotheses <- "Gym = Grund; Gym < Grund; Gym > Grund"

bf_90 <- bain(bf_data, 
              hypothesis = hypotheses,
              n = samp_gr,           # size of the groups
              Sigma = variances,     # matrices of residual variances of groups
              group_parameters = 1,  # there is 1 group specific parameter (the mean in each group)
              joint_parameters = 0   # there are no parameters that apply to each of the groups (e.g. the regression coefficient of a covariate)
              )

print(bf_90)
## Bayesian informative hypothesis testing for an object of class numeric:
## 
##    Fit_eq Com_eq Fit_in Com_in Fit   Com   BF       PMPa  PMPb 
## H1 0.001  0.039  1.000  1.000  0.001 0.039 0.025    0.012 0.008
## H2 1.000  1.000  1.000  0.500  1.000 0.500 2896.832 0.987 0.661
## H3 1.000  1.000  0.000  0.500  0.000 0.500 0.000    0.000 0.000
## Hu                                                        0.331
## 
## Hypotheses:
##   H1: Gym=Grund
##   H2: Gym<Grund
##   H3: Gym>Grund
## 
## Note: BF denotes the Bayes factor of the hypothesis at hand versus its complement.
##            H1           H2         H3
## H1  1.0000000 0.0126602106   36.67451
## H2 78.9876273 1.0000000000 2896.83229
## H3  0.0272669 0.0003452047    1.00000

Prädiktor Klassenstufe

Deskriptiver Plot


Inferenzstatistik
Aufgrund fehlender Standards beim Pooling von Bayes-Faktoren (BF) berechnen wir für jeden Datensatz einen Bayes-Faktor und berichten anschließend deren Verteilung.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.9624 14.4547 23.2238 21.3723 26.8566 40.5913

Häufigkeiten der Lehrpersonen-/SuS-Handlungen

Deskriptive Daten aller Merkmale

  • Grundschule:

           vars   n    mean     sd median trimmed   mad min max range   skew kurtosis    se
    lh_ank    1 322   0.925  0.263      1   1.000  0.00   0   1     1 -3.225    8.426 0.015
    lh_auf    2 320   0.809  0.393      1   0.887  0.00   0   1     1 -1.568    0.460 0.022
    lh_nen    3 319   0.947  0.225      1   1.000  0.00   0   1     1 -3.959   13.716 0.013
    lh_sch    4 319   0.715  0.452      1   0.767  0.00   0   1     1 -0.947   -1.107 0.025
    lh_erl    5 320   0.684  0.465      1   0.730  0.00   0   1     1 -0.790   -1.381 0.026
    lh_wfr    6 320   0.241  0.428      0   0.176  0.00   0   1     1  1.208   -0.543 0.024
    lh_wno    7 322   0.696  0.461      1   0.744  0.00   0   1     1 -0.846   -1.287 0.026
    lh_ano    8 321   0.586  0.493      1   0.607  0.00   0   1     1 -0.346   -1.886 0.028
    lh_bfr    9 320   0.781  1.499      0   0.480  0.00   0  15    15  4.395   31.258 0.084
    sh_auf   10 320   4.169  0.968      4   4.316  1.48   1   5     4 -1.230    1.291 0.054
    sh_mel   11 321   0.611  1.207      0   0.319  0.00   0   9     9  2.668    9.296 0.067
    sh_fra   12 320   0.844  1.648      0   0.492  0.00   0  15    15  3.862   22.788 0.092
    sh_not   13 321  71.240 38.780     90  76.529 14.83   0 100   100 -1.055   -0.590 2.164
  • Gymnasium:

           vars   n    mean     sd median trimmed   mad min max range   skew kurtosis    se
    lh_ank    1 183   0.770  0.422      1   0.837  0.00   0   1     1 -1.276   -0.374 0.031
    lh_auf    2 181   0.591  0.493      1   0.614  0.00   0   1     1 -0.368   -1.875 0.037
    lh_nen    3 123   0.967  0.178      1   1.000  0.00   0   1     1 -5.207   25.317 0.016
    lh_sch    4 183   0.628  0.485      1   0.660  0.00   0   1     1 -0.527   -1.731 0.036
    lh_erl    5 183   0.650  0.478      1   0.687  0.00   0   1     1 -0.625   -1.618 0.035
    lh_wfr    6 184   0.207  0.406      0   0.135  0.00   0   1     1  1.438    0.069 0.030
    lh_wno    7 184   0.321  0.468      0   0.277  0.00   0   1     1  0.762   -1.427 0.035
    lh_ano    8 179   0.196  0.398      0   0.124  0.00   0   1     1  1.523    0.320 0.030
    lh_bfr    9 121   0.661  1.029      0   0.443  0.00   0   6     6  2.433    7.657 0.094
    sh_auf   10 108   3.815  1.034      4   3.898  1.48   1   5     4 -0.435   -0.793 0.099
    sh_mel   11 123   0.602  0.981      0   0.394  0.00   0   6     6  2.350    7.428 0.088
    sh_fra   12 183   0.705  1.218      0   0.429  0.00   0   7     7  2.569    7.707 0.090
    sh_not   13 171  59.865 37.651     70  62.314 44.48   0 100   100 -0.443   -1.345 2.879

L kündigt HA an

Prädiktor Schulart

Aufgrund fehlender Standards beim Pooling von Bayes-Faktoren (BF) berechnen wir für jeden Datensatz einen Bayes-Faktor und berichten anschließend deren Verteilung.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1730    9621    9621    9294    9621   38702

Effektstärke: \(\varphi\)= -0.218

L fordert Aufmerksamkeit

Prädiktor Schulart

Aufgrund fehlender Standards beim Pooling von Bayes-Faktoren (BF) berechnen wir für jeden Datensatz einen Bayes-Faktor und berichten anschließend deren Verteilung.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   11132   71353   85423  131640  159914  774708

Effektstärke: \(\varphi\)= -0.236

L nennt HA

Prädiktor Schulart

Aufgrund fehlender Standards beim Pooling von Bayes-Faktoren (BF) berechnen wir für jeden Datensatz einen Bayes-Faktor und berichten anschließend deren Verteilung.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.05133 0.05412 0.06194 0.08576 0.07829 3.59638

Effektstärke: \(\varphi\)= 0.009

L schreibt HA an

Prädiktor Schulart

Aufgrund fehlender Standards beim Pooling von Bayes-Faktoren (BF) berechnen wir für jeden Datensatz einen Bayes-Faktor und berichten anschließend deren Verteilung.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.4367  0.6171  0.7092  0.7208  0.8198  1.4633

Effektstärke: \(\varphi\)= -0.086

L erläutert HA

Prädiktor Schulart

Aufgrund fehlender Standards beim Pooling von Bayes-Faktoren (BF) berechnen wir für jeden Datensatz einen Bayes-Faktor und berichten anschließend deren Verteilung.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1248  0.1411  0.1488  0.1512  0.1578  0.1906

Effektstärke: \(\varphi\)= -0.036

L will Fragen

Prädiktor Schulart

Aufgrund fehlender Standards beim Pooling von Bayes-Faktoren (BF) berechnen wir für jeden Datensatz einen Bayes-Faktor und berichten anschließend deren Verteilung.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1198  0.1420  0.1530  0.1509  0.1530  0.1976

Effektstärke: \(\varphi\)= -0.042

L will Notation

Prädiktor Schulart

Aufgrund fehlender Standards beim Pooling von Bayes-Faktoren (BF) berechnen wir für jeden Datensatz einen Bayes-Faktor und berichten anschließend deren Verteilung.

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 1.754e+13 5.581e+13 8.324e+13 9.702e+13 1.502e+14 1.502e+14

Effektstärke: \(\varphi\)= -0.365

L achtet auf Notation

Prädiktor Schulart

Aufgrund fehlender Standards beim Pooling von Bayes-Faktoren (BF) berechnen wir für jeden Datensatz einen Bayes-Faktor und berichten anschließend deren Verteilung.

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 2.479e+14 4.433e+15 1.396e+16 1.711e+16 2.613e+16 4.685e+16

Effektstärke: \(\varphi\)= -0.382

L beantwortet Fragen

Prädiktor Schulart

#### Compute BFs within informed hypotheses framework (bain) ##################################### #

### for bain: compute vcov by hand ###
# create data frame to collect var and cov for each imputation
vcov_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

mean_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

# sample size per group
samp_gr <- table(mice::complete(imp)$Schulart)


## loop over all m imputations
for(i in 1:m) {
  fit_lm <- lm(lh_bfr ~ Schulart-1, data = mice::complete(imp, i))
  
  var_lm <- summary(fit_lm)$sigma**2 # residual variance; divided by n below to get the variance of the mean (VOM)
  vcov_df[i, "Grundschule"] <- var_lm/samp_gr["Grundschule"] # compute VOM per group
  vcov_df[i, "Gymnasium"] <- var_lm/samp_gr["Gymnasium"] # compute VOM per group
  
  # collect estimates of the means
  mean_df[i, "Grundschule"] <- coef(fit_lm)["SchulartGrundschule"]
  mean_df[i, "Gymnasium"] <- coef(fit_lm)["SchulartGymnasium"]
  
  rm(fit_lm, var_lm) # clean up, because we like it tidy in here
}

## make var matrices 
# compute the mean over var
vcov_df <- vcov_df %>%
        summarize_all(mean)

# create matrices
mat1 <- matrix(vcov_df$Grundschule, 1, 1)
mat2 <- matrix(vcov_df$Gymnasium, 1, 1)

variances <- list(mat1, mat2)

## compute mean of estimates
bf_data <- mean_df %>%
        summarize_all(mean)
bf_data <- as.numeric(bf_data)
names(bf_data) <- c("Grund", "Gym")

## BAIN ##### #
# generating hypotheses
hypotheses <- "Gym = Grund; Gym < Grund; Gym > Grund"

bf_hyp <- bain(bf_data, 
              hypothesis = hypotheses,
              n = samp_gr,           # size of the groups
              Sigma = variances,     # matrices of residual variances of groups
              group_parameters = 1,  # there is 1 group specific parameter (the mean in each group)
              joint_parameters = 0   # there are no parameters that apply to each of the groups (e.g. the regression coefficient of a covariate)
              )

print(bf_hyp)
## Bayesian informative hypothesis testing for an object of class numeric:
## 
##    Fit_eq Com_eq Fit_in Com_in Fit   Com   BF     PMPa  PMPb 
## H1 2.523  0.145  1.000  1.000  2.523 0.145 17.345 0.897 0.853
## H2 1.000  1.000  0.749  0.500  0.749 0.500 2.979  0.077 0.074
## H3 1.000  1.000  0.251  0.500  0.251 0.500 0.336  0.026 0.025
## Hu                                                      0.049
## 
## Hypotheses:
##   H1: Gym=Grund
##   H2: Gym<Grund
##   H3: Gym>Grund
## 
## Note: BF denotes the Bayes factor of the hypothesis at hand versus its complement.
##            H1         H2        H3
## H1 1.00000000 11.5835903 34.510626
## H2 0.08632902  1.0000000  2.979269
## H3 0.02897658  0.3356529  1.000000

SuS sind aufmerksam

Prädiktor Schulart

#### Compute BFs within informed hypotheses framework (bain) ##################################### #

### for bain: compute vcov by hand ###
# create data frame to collect var and cov for each imputation
vcov_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

mean_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

# sample size per group
samp_gr <- table(mice::complete(imp)$Schulart)


## loop over all m imputations
for(i in 1:m) {
  fit_lm <- lm(sh_auf ~ Schulart-1, data = mice::complete(imp, i))
  
  var_lm <- summary(fit_lm)$sigma**2 # residual variance; divided by n below to get the variance of the mean (VOM)
  vcov_df[i, "Grundschule"] <- var_lm/samp_gr["Grundschule"] # compute VOM per group
  vcov_df[i, "Gymnasium"] <- var_lm/samp_gr["Gymnasium"] # compute VOM per group
  
  # collect estimates of the means
  mean_df[i, "Grundschule"] <- coef(fit_lm)["SchulartGrundschule"]
  mean_df[i, "Gymnasium"] <- coef(fit_lm)["SchulartGymnasium"]
  
  rm(fit_lm, var_lm) # clean up, because we like it tidy in here
}

## make var matrices 
# compute the mean over var
vcov_df <- vcov_df %>%
        summarize_all(mean)

# create matrices
mat1 <- matrix(vcov_df$Grundschule, 1, 1)
mat2 <- matrix(vcov_df$Gymnasium, 1, 1)

variances <- list(mat1, mat2)

## compute mean of estimates
bf_data <- mean_df %>%
        summarize_all(mean)
bf_data <- as.numeric(bf_data)
names(bf_data) <- c("Grund", "Gym")

## BAIN ##### #
# generating hypotheses
hypotheses <- "Gym = Grund; Gym < Grund; Gym > Grund"

bf_hyp <- bain(bf_data, 
              hypothesis = hypotheses,
              n = samp_gr,           # size of the groups
              Sigma = variances,     # matrices of residual variances of groups
              group_parameters = 1,  # there is 1 group specific parameter (the mean in each group)
              joint_parameters = 0   # there are no parameters that apply to each of the groups (e.g. the regression coefficient of a covariate)
              )

print(bf_hyp)
## Bayesian informative hypothesis testing for an object of class numeric:
## 
##    Fit_eq Com_eq Fit_in Com_in Fit   Com   BF         PMPa  PMPb 
## H1 0.000  0.199  1.000  1.000  0.000 0.199 0.001      0.001 0.000
## H2 1.000  1.000  1.000  0.500  1.000 0.500 176990.839 0.999 0.666
## H3 1.000  1.000  0.000  0.500  0.000 0.500 0.000      0.000 0.000
## Hu                                                          0.333
## 
## Hypotheses:
##   H1: Gym=Grund
##   H2: Gym<Grund
##   H3: Gym>Grund
## 
## Note: BF denotes the Bayes factor of the hypothesis at hand versus its complement.
##              H1          H2          H3
## H1 1.000000e+00 7.07254e-04    125.1775
## H2 1.413919e+03 1.00000e+00 176990.8373
## H3 7.988658e-03 5.65001e-06      1.0000

Plot der deskriptiven Daten

###### DESCRIPTIVE PLOT ############################################### #

sh_auf_p <- p_data %>%
  dplyr::filter(!is.na(sh_auf) & !is.na(Klassenstufe)) %>%
  dplyr::group_by(Klassenstufe) %>%
  dplyr::mutate(length = length(sh_auf)) %>%
  dplyr::summarize(sh_auf = mean(sh_auf, na.rm=T),
            length_n = mean(length)) %>%
  ungroup() %>%
  mutate(Klassenstufe = factor(Klassenstufe, levels = c("1", "1_2", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12")),
         Schulart = as.factor(case_when(
                      Klassenstufe == "1" ~ "Grundschule",
                      Klassenstufe == "1_2" ~ "Grundschule",
                      Klassenstufe == "2" ~ "Grundschule",
                      Klassenstufe == "3" ~ "Grundschule",
                      Klassenstufe == "4" ~ "Grundschule",
                      TRUE ~ "Gymnasium"
         )))

ggplot(p_data_ridges, aes(x=as.factor(Klassenstufe), y = sh_auf, fill = Schulart,
                          colour = Schulart, group = as.factor(Klassenstufe))) +
    geom_flat_violin(adjust = 1,
                     trim = F,
                     alpha = .3,
                     # scale = "count",
                     width = 2) +
    stat_summary(fun.y = mean, geom = "line", aes(group = 1), position = position_nudge(x=.1), size = 1, color = "grey") +
    stat_summary(aes(size = sh_auf_n), fun.y = mean, geom = "point",
                 alpha = .85, position = position_nudge(x=.1)) +
    geom_text(data = sh_auf_p, aes(label = round(sh_auf, 2), vjust = 2, hjust = -.1), position = position_dodge(width = 1), color = "black") +
    scale_colour_brewer(palette = "Set1")+
    scale_fill_brewer(palette = "Set1") +
    scale_y_continuous(expand = c(0, 0), breaks = c(1:5), limits = c(1, 5)) +
    scale_x_discrete(expand = c(0, 0), breaks = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"),
                     limits = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13")) +
    xlab("Klassenstufe") +
    ylab("Zustimmung (Likert Item)") +
    ggtitle("Eingeschätzte Aufmerksamkeit der SuS") +
    labs(size = "Anzahl\neingegangener\nStunden") +
    theme_light()

SuS melden sich

Prädiktor Schulart

#### Compute BFs within informed hypotheses framework (bain) ##################################### #

### for bain: compute vcov by hand ###
# create data frame to collect var and cov for each imputation
vcov_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

mean_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

# sample size per group
samp_gr <- table(mice::complete(imp)$Schulart)


## loop over all m imputations
for(i in 1:m) {
  fit_lm <- lm(sh_mel ~ Schulart-1, data = mice::complete(imp, i))
  
  var_lm <- summary(fit_lm)$sigma**2 # residual variance; divided by n below to get the variance of the mean (VOM)
  vcov_df[i, "Grundschule"] <- var_lm/samp_gr["Grundschule"] # compute VOM per group
  vcov_df[i, "Gymnasium"] <- var_lm/samp_gr["Gymnasium"] # compute VOM per group
  
  # collect estimates of the means
  mean_df[i, "Grundschule"] <- coef(fit_lm)["SchulartGrundschule"]
  mean_df[i, "Gymnasium"] <- coef(fit_lm)["SchulartGymnasium"]
  
  rm(fit_lm, var_lm) # clean up, because we like it tidy in here
}

## make var matrices 
# compute the mean over var
vcov_df <- vcov_df %>%
        summarize_all(mean)

# create matrices
mat1 <- matrix(vcov_df$Grundschule, 1, 1)
mat2 <- matrix(vcov_df$Gymnasium, 1, 1)

variances <- list(mat1, mat2)

## compute mean of estimates
bf_data <- mean_df %>%
        summarize_all(mean)
bf_data <- as.numeric(bf_data)
names(bf_data) <- c("Grund", "Gym")

## BAIN ##### #
# generating hypotheses
hypotheses <- "Gym = Grund; Gym < Grund; Gym > Grund"

bf_hyp <- bain(bf_data, 
              hypothesis = hypotheses,
              n = samp_gr,           # size of the groups
              Sigma = variances,     # matrices of residual variances of groups
              group_parameters = 1,  # there is 1 group specific parameter (the mean in each group)
              joint_parameters = 0   # there are no parameters that apply to each of the groups (e.g. the regression coefficient of a covariate)
              )

print(bf_hyp)
## Bayesian informative hypothesis testing for an object of class numeric:
## 
##    Fit_eq Com_eq Fit_in Com_in Fit   Com   BF     PMPa  PMPb 
## H1 3.695  0.170  1.000  1.000  3.695 0.170 21.714 0.916 0.879
## H2 1.000  1.000  0.504  0.500  0.504 0.500 1.018  0.043 0.041
## H3 1.000  1.000  0.496  0.500  0.496 0.500 0.983  0.042 0.040
## Hu                                                      0.040
## 
## Hypotheses:
##   H1: Gym=Grund
##   H2: Gym<Grund
##   H3: Gym>Grund
## 
## Note: BF denotes the Bayes factor of the hypothesis at hand versus its complement.
##            H1         H2        H3
## H1 1.00000000 21.5251188 21.906863
## H2 0.04645735  1.0000000  1.017735
## H3 0.04564779  0.9825742  1.000000

Plot der deskriptiven Daten

###### DESCRIPTIVE PLOT ############################################### #

sh_mel_p <- p_data %>%
  dplyr::filter(!is.na(sh_mel) & !is.na(Klassenstufe)) %>%
  dplyr::group_by(Klassenstufe) %>%
  dplyr::mutate(length = length(sh_mel)) %>%
  dplyr::summarize(sh_mel_sd = sd(sh_mel, na.rm=T),
                   sh_mel = mean(sh_mel, na.rm=T),
                   length_n = mean(length)) %>%
  ungroup() %>%
  mutate(Klassenstufe = factor(Klassenstufe, levels = c("1", "1_2", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12")),
         Schulart = as.factor(case_when(
                      Klassenstufe == "1" ~ "Grundschule",
                      Klassenstufe == "1_2" ~ "Grundschule",
                      Klassenstufe == "2" ~ "Grundschule",
                      Klassenstufe == "3" ~ "Grundschule",
                      Klassenstufe == "4" ~ "Grundschule",
                      TRUE ~ "Gymnasium"
         )))

ggplot(p_data_ridges, aes(x=as.factor(Klassenstufe), y = sh_mel, fill = Schulart,
                          colour = Schulart, group = as.factor(Klassenstufe))) +
    geom_flat_violin(adjust = 1,
                     trim = F,
                     alpha = .3,
                     # scale = "count",
                     width = 2) +
    stat_summary(fun.y = mean, geom = "line", aes(group = 1), position = position_nudge(x=.1), size = 1, color = "grey") +
    stat_summary(aes(size = sh_mel_n), fun.y = mean, geom = "point",
                 alpha = .85, position = position_nudge(x=.1)) +
    geom_text(data = sh_mel_p, aes(label = sub("^(-?)0.", "\\1.", round(sh_mel, 2)), vjust = -0.5, hjust = -.2), position = position_dodge(width = 1), color = "black") +
    scale_colour_brewer(palette = "Set1")+
    scale_fill_brewer(palette = "Set1") +
    scale_y_continuous(expand = c(0, 0), breaks = c(0:3), limits = c(0, 3)) +
    scale_x_discrete(expand = c(0, 0), breaks = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"),
                     limits = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13")) +
    xlab("Klassenstufe") +
    ylab("absolute Häufigkeit") +
    ggtitle("Anzahl Schülermeldungen") +
    labs(size = "Anzahl\neingegangener\nStunden") +
    theme_light()

SuS fragen

Prädiktor Schulart

#### Compute BFs within informed hypotheses framework (bain) ##################################### #

### for bain: compute vcov by hand ###
# create data frame to collect var and cov for each imputation
vcov_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

mean_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

# sample size per group
samp_gr <- table(mice::complete(imp)$Schulart)


## loop over all m imputations
for(i in 1:m) {
  fit_lm <- lm(sh_fra ~ Schulart-1, data = mice::complete(imp, i))
  
  var_lm <- summary(fit_lm)$sigma**2 # residual variance; divided by n below to get the variance of the mean (VOM)
  vcov_df[i, "Grundschule"] <- var_lm/samp_gr["Grundschule"] # compute VOM per group
  vcov_df[i, "Gymnasium"] <- var_lm/samp_gr["Gymnasium"] # compute VOM per group
  
  # collect estimates of the means
  mean_df[i, "Grundschule"] <- coef(fit_lm)["SchulartGrundschule"]
  mean_df[i, "Gymnasium"] <- coef(fit_lm)["SchulartGymnasium"]
  
  rm(fit_lm, var_lm) # clean up, because we like it tidy in here
}

## make var matrices 
# compute the mean over var
vcov_df <- vcov_df %>%
        summarize_all(mean)

# create matrices
mat1 <- matrix(vcov_df$Grundschule, 1, 1)
mat2 <- matrix(vcov_df$Gymnasium, 1, 1)

variances <- list(mat1, mat2)

## compute mean of estimates
bf_data <- mean_df %>%
        summarize_all(mean)
bf_data <- as.numeric(bf_data)
names(bf_data) <- c("Grund", "Gym")

## BAIN ##### #
# generating hypotheses
hypotheses <- "Gym = Grund; Gym < Grund; Gym > Grund"

bf_hyp <- bain(bf_data, 
              hypothesis = hypotheses,
              n = samp_gr,           # size of the groups
              Sigma = variances,     # matrices of residual variances of groups
              group_parameters = 1,  # there is 1 group specific parameter (the mean in each group)
              joint_parameters = 0   # there are no parameters that apply to each of the groups (e.g. the regression coefficient of a covariate)
              )

print(bf_hyp)
## Bayesian informative hypothesis testing for an object of class numeric:
## 
##    Fit_eq Com_eq Fit_in Com_in Fit   Com   BF     PMPa  PMPb 
## H1 1.716  0.133  1.000  1.000  1.716 0.133 12.940 0.866 0.812
## H2 1.000  1.000  0.846  0.500  0.846 0.500 5.475  0.113 0.106
## H3 1.000  1.000  0.154  0.500  0.154 0.500 0.183  0.021 0.019
## Hu                                                      0.063
## 
## Hypotheses:
##   H1: Gym=Grund
##   H2: Gym<Grund
##   H3: Gym>Grund
## 
## Note: BF denotes the Bayes factor of the hypothesis at hand versus its complement.
##            H1        H2        H3
## H1 1.00000000 7.6514875 41.892639
## H2 0.13069354 1.0000000  5.475097
## H3 0.02387054 0.1826452  1.000000

Plot der deskriptiven Daten

###### DESCRIPTIVE PLOT ############################################### #

sh_fra_p <- p_data %>%
  dplyr::filter(!is.na(sh_fra) & !is.na(Klassenstufe)) %>%
  group_by(Klassenstufe) %>%
  mutate(length = length(sh_fra)) %>%
  summarize(sh_fra = mean(sh_fra, na.rm=T),
            length_n = mean(length)) %>%
  ungroup() %>%
  mutate(Klassenstufe = factor(Klassenstufe, levels = c("1", "1_2", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12")),
         Schulart = as.factor(case_when(
                      Klassenstufe == "1" ~ "Grundschule",
                      Klassenstufe == "1_2" ~ "Grundschule",
                      Klassenstufe == "2" ~ "Grundschule",
                      Klassenstufe == "3" ~ "Grundschule",
                      Klassenstufe == "4" ~ "Grundschule",
                      TRUE ~ "Gymnasium"
         )))

ggplot(p_data_ridges, aes(x=as.factor(Klassenstufe), y = sh_fra, fill = Schulart,
                          colour = Schulart, group = as.factor(Klassenstufe))) +
    geom_flat_violin(adjust = 1,
                     trim = F,
                     alpha = .3,
                     # scale = "count",
                     width = 2) +
    stat_summary(fun.y = mean, geom = "line", aes(group = 1), position = position_nudge(x=.1), size = 1, color = "grey") +
    stat_summary(aes(size = sh_fra_n), fun.y = mean, geom = "point",
                 alpha = .85, position = position_nudge(x=.1)) +
    geom_text(data = sh_fra_p, aes(label = sub("^(-?)0.", "\\1.", round(sh_fra, 2)), vjust = -.5, hjust = -.1), position = position_dodge(width = 1), 
              color = "black") +
    scale_colour_brewer(palette = "Set1")+
    scale_fill_brewer(palette = "Set1") +
    scale_y_continuous(expand = c(0, 0), breaks = c(0:3), limits = c(0, 3)) +
    scale_x_discrete(expand = c(0, 0), breaks = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"),
                     limits = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13")) +
    xlab("Klassenstufe") +
    ylab("absolute Häufigkeit") +
    ggtitle("Anzahl Schülerfragen") +
    labs(size = "Anzahl\neingegangener\nStunden") +
    theme_light()

SuS notieren

Prädiktor Schulart

#### Compute BFs within informed hypotheses framework (bain) ##################################### #

### for bain: compute vcov by hand ###
# create data frame to collect var and cov for each imputation
vcov_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

mean_df <- data.frame(Grundschule = as.numeric(),
                      Gymnasium = as.numeric()
                      )

# sample size per group
samp_gr <- table(mice::complete(imp)$Schulart)


## loop over all m imputations
for(i in 1:m) {
  fit_lm <- lm(sh_not ~ Schulart-1, data = mice::complete(imp, i))
  
  var_lm <- summary(fit_lm)$sigma**2 # residual variance; divided by n below to get the variance of the mean (VOM)
  vcov_df[i, "Grundschule"] <- var_lm/samp_gr["Grundschule"] # compute VOM per group
  vcov_df[i, "Gymnasium"] <- var_lm/samp_gr["Gymnasium"] # compute VOM per group
  
  # collect estimates of the means
  mean_df[i, "Grundschule"] <- coef(fit_lm)["SchulartGrundschule"]
  mean_df[i, "Gymnasium"] <- coef(fit_lm)["SchulartGymnasium"]
  
  rm(fit_lm, var_lm) # clean up, because we like it tidy in here
}

## make var matrices 
# compute the mean over var
vcov_df <- vcov_df %>%
        summarize_all(mean)

# create matrices
mat1 <- matrix(vcov_df$Grundschule, 1, 1)
mat2 <- matrix(vcov_df$Gymnasium, 1, 1)

variances <- list(mat1, mat2)

## compute mean of estimates
bf_data <- mean_df %>%
        summarize_all(mean)
bf_data <- as.numeric(bf_data)
names(bf_data) <- c("Grund", "Gym")

## BAIN ##### #
# generating hypotheses
hypotheses <- "Gym = Grund; Gym < Grund; Gym > Grund"

bf_hyp <- bain(bf_data, 
              hypothesis = hypotheses,
              n = samp_gr,           # size of the groups
              Sigma = variances,     # matrices of residual variances of groups
              group_parameters = 1,  # there is 1 group specific parameter (the mean in each group)
              joint_parameters = 0   # there are no parameters that apply to each of the groups (e.g. the regression coefficient of a covariate)
              )

print(bf_hyp)
## Bayesian informative hypothesis testing for an object of class numeric:
## 
##    Fit_eq Com_eq Fit_in Com_in Fit   Com   BF       PMPa  PMPb 
## H1 0.000  0.005  1.000  1.000  0.000 0.005 0.068    0.033 0.022
## H2 1.000  1.000  1.000  0.500  1.000 0.500 2905.398 0.967 0.652
## H3 1.000  1.000  0.000  0.500  0.000 0.500 0.000    0.000 0.000
## Hu                                                        0.326
## 
## Hypotheses:
##   H1: Gym=Grund
##   H2: Gym<Grund
##   H3: Gym>Grund
## 
## Note: BF denotes the Bayes factor of the hypothesis at hand versus its complement.
##             H1           H2         H3
## H1  1.00000000 0.0342085490   99.38945
## H2 29.23245880 1.0000000000 2905.39808
## H3  0.01006143 0.0003441869    1.00000

Plot der deskriptiven Daten

###### DESCRIPTIVE PLOT ############################################### #

sh_not_p <- p_data %>%
  dplyr::filter(!is.na(sh_not) & !is.na(Klassenstufe)) %>%
  dplyr::group_by(Klassenstufe) %>%
  dplyr::mutate(length = length(sh_not)) %>%
  dplyr::summarize(sh_not = mean(sh_not, na.rm=T),
            length_n = mean(length)) %>%
  ungroup() %>%
  mutate(Klassenstufe = factor(Klassenstufe, levels = c("1", "1_2", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12")),
         Schulart = as.factor(case_when(
                      Klassenstufe == "1" ~ "Grundschule",
                      Klassenstufe == "1_2" ~ "Grundschule",
                      Klassenstufe == "2" ~ "Grundschule",
                      Klassenstufe == "3" ~ "Grundschule",
                      Klassenstufe == "4" ~ "Grundschule",
                      TRUE ~ "Gymnasium"
         )))


p_data_ridges <- p_data_ridges %>%
    mutate(sh_not_p = sh_not/100)

sh_not_p <- sh_not_p %>%
    mutate(sh_not_p = sh_not/100)

ggplot(p_data_ridges, aes(x=as.factor(Klassenstufe), y = sh_not_p, fill = Schulart,
                          colour = Schulart, group = as.factor(Klassenstufe))) +
    geom_flat_violin(adjust = 1,
                     trim = F,
                     alpha = .3,
                     # scale = "count",
                     width = 2) +
    stat_summary(fun.y = mean, geom = "line", aes(group = 1), position = position_nudge(x=.1), size = 1, color = "grey") +
    stat_summary(aes(size = sh_not_n), fun.y = mean, geom = "point",
                 alpha = .85, position = position_nudge(x=.1)) +
    geom_text(data = sh_not_p, aes(label = round(sh_not_p, 2)*100, vjust = 2, hjust = -.3),
              position = position_dodge(width = 1), color = "black") +
    scale_colour_brewer(palette = "Set1")+
    scale_fill_brewer(palette = "Set1") +
    scale_y_continuous(expand = c(0, 0), breaks = c(0,0.2,0.4,0.6,0.8,1), limits = c(0, 1),
                       labels = c("0", "20", "40", "60", "80", "100")) +
    scale_x_discrete(expand = c(0, 0), breaks = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"),
                     limits = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13")) +
    xlab("Klassenstufe") +
    ylab("∅ relative Häufigkeit [%]") +
    ggtitle("Anteil der notierenden SuS") +
    labs(size = "Anzahl\neingegangener\nStunden") +
    theme_light()

Korrelationen

Bivariate Inter-Item-Korrelationen

In Abhängigkeit von der Kombination der Skalenniveaus beider Variablen werden entsprechende Korrelationen berechnet.

  • Beide Variablen metrisch: Pearson’s Produkt-Moment-Korrelation \(r\)
  • Eine Variable metrisch, eine Variable dichotom: Punkt-Biseriale-Korrelation \(r_{pb}\) (in diesem Fall mathematisch äquivalent zu Pearson’s Produkt-Moment-Korrelation)
  • Beide Variablen dichotom: Pearson’s \(\varphi\)

Bei großen Stichproben approximiert Pearson’s \(\varphi\) die Pearson-Produkt-Moment-Korrelation \(r\), sodass die berechneten Werte für \(\varphi\) und \(r\) in ihrer Größe vergleichbar sind.
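Dass \(\varphi\) bei 0/1-Kodierung sogar exakt mit der Produkt-Moment-Korrelation übereinstimmt, lässt sich an simulierten Daten kurz prüfen (illustrative Skizze, nicht Teil der Originalauswertung):

```r
## two dichotomous 0/1 variables, simulated so that they are correlated
set.seed(42)
x <- rbinom(200, 1, 0.6)
y <- ifelse(runif(200) < 0.7, x, rbinom(200, 1, 0.5))

tab <- table(x, y)               # 2x2 contingency table
a <- tab[1, 1]; b <- tab[1, 2]   # x = 0 row
c_ <- tab[2, 1]; d <- tab[2, 2]  # x = 1 row

# phi coefficient from the 2x2 table
phi <- (a * d - b * c_) / sqrt((a + b) * (c_ + d) * (a + c_) * (b + d))

all.equal(as.numeric(phi), cor(x, y))  # identical up to floating point
```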


Computing all correlations using Pearson. Correlations between two dichotomous variables will be replaced later by Pearson’s \(\varphi\) (see below). Computing them first only to replace them afterwards is admittedly redundant, but it was simply easier/quicker to compute all correlations (on the full data set) and then substitute the dichotomous ones.

Computing dichotomous correlations.


In order to combine correlation estimates via Rubin’s Rules, we first apply the Fisher z-transformation (correlation to z-score), since correlation coefficients are not normally distributed, and afterwards transform the pooled values back to correlation coefficients. https://dx.doi.org/10.1186%2F1471-2288-9-57
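Der Pooling-Schritt lässt sich wie folgt skizzieren (illustrativ, mit frei erfundenen Werten; die angenommene Stichprobengröße n dient nur der Demonstration):

```r
## fictitious correlation estimate from each of m = 5 imputed data sets
r_imp <- c(0.52, 0.48, 0.55, 0.50, 0.47)
n <- 300                 # assumed sample size (illustration only)

z_imp  <- atanh(r_imp)   # Fisher z-transformation
z_pool <- mean(z_imp)    # pooled point estimate (Rubin's rules)

u_bar <- 1 / (n - 3)                              # within-imputation variance of z
b_var <- var(z_imp)                               # between-imputation variance
t_var <- u_bar + (1 + 1 / length(z_imp)) * b_var  # total variance (Rubin)

r_pool <- tanh(z_pool)   # back-transformation to the correlation scale
round(r_pool, 3)
```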


Hover over the individual points with the mouse to see which correlation each one represents.

Exact correlation values

Correlation r_GS r_GY
lh_ank-lh_bfr 0.061 0.008
lh_ank-sh_auf 0.234 0.244
lh_ank-sh_mel 0.028 0.054
lh_ank-sh_fra 0.052 0.023
lh_ank-sh_not 0.158 0.150
lh_auf-lh_bfr 0.004 0.056
lh_auf-sh_auf 0.281 0.093
lh_auf-sh_mel -0.031 0.040
lh_auf-sh_fra -0.031 0.045
lh_auf-sh_not 0.077 -0.006
lh_nen-lh_bfr 0.031 -0.021
lh_nen-sh_auf 0.099 0.152
lh_nen-sh_mel 0.063 -0.033
lh_nen-sh_fra 0.029 -0.035
lh_nen-sh_not -0.012 -0.046
lh_sch-lh_bfr -0.073 0.091
lh_sch-sh_auf 0.105 0.110
lh_sch-sh_mel 0.002 -0.004
lh_sch-sh_fra -0.055 0.074
lh_sch-sh_not 0.544 0.585
lh_erl-lh_bfr 0.183 0.151
lh_erl-sh_auf 0.307 0.111
lh_erl-sh_mel 0.176 0.098
lh_erl-sh_fra 0.143 0.150
lh_erl-sh_not 0.095 0.143
lh_wfr-lh_bfr 0.155 0.078
lh_wfr-sh_auf 0.085 0.185
lh_wfr-sh_mel 0.324 0.089
lh_wfr-sh_fra 0.137 0.048
lh_wfr-sh_not 0.126 0.052
lh_bfr-lh_wno -0.071 0.133
lh_bfr-lh_ano -0.002 0.132
lh_bfr-sh_auf 0.074 -0.001
lh_bfr-sh_mel 0.578 0.746
lh_bfr-sh_fra 0.937 0.918
lh_bfr-sh_not -0.069 0.100
lh_wno-sh_auf 0.070 0.120
lh_wno-sh_mel 0.050 0.008
lh_wno-sh_fra -0.044 0.122
lh_wno-sh_not 0.793 0.504
lh_ano-sh_auf 0.104 0.253
lh_ano-sh_mel 0.093 0.081
lh_ano-sh_fra 0.017 0.133
lh_ano-sh_not 0.702 0.440
sh_auf-sh_mel 0.059 -0.051
sh_auf-sh_fra 0.084 -0.001
sh_auf-sh_not 0.105 0.131
sh_mel-sh_fra 0.565 0.753
sh_mel-sh_not 0.053 0.005
sh_fra-sh_not -0.050 0.087
lh_ank-lh_auf 0.192 0.178
lh_ank-lh_nen -0.013 0.001
lh_ank-lh_sch 0.111 0.222
lh_ank-lh_erl 0.149 0.186
lh_ank-lh_wfr 0.102 0.114
lh_ank-lh_wno 0.181 0.108
lh_ank-lh_ano 0.133 0.126
lh_auf-lh_nen 0.113 0.161
lh_auf-lh_sch 0.045 0.111
lh_auf-lh_erl 0.144 0.154
lh_auf-lh_wfr 0.143 0.176
lh_auf-lh_wno 0.120 0.136
lh_auf-lh_ano 0.116 0.170
lh_nen-lh_sch 0.097 -0.107
lh_nen-lh_erl 0.136 0.025
lh_nen-lh_wfr 0.031 0.011
lh_nen-lh_wno 0.057 -0.046
lh_nen-lh_ano 0.056 0.004
lh_sch-lh_erl 0.113 0.148
lh_sch-lh_wfr 0.088 0.077
lh_sch-lh_wno 0.464 0.377
lh_sch-lh_ano 0.348 0.292
lh_erl-lh_wfr 0.210 0.143
lh_erl-lh_wno 0.030 0.235
lh_erl-lh_ano 0.136 0.208
lh_wfr-lh_wno 0.139 0.165
lh_wfr-lh_ano 0.150 0.189
lh_wno-lh_ano 0.637 0.556


Correlations between the timing and the duration of homework assignment
across both school types.

Stundenlaenge variable1 variable2 r
Einzelstunde beg_hw_min dau_hw_min -0.326
Doppelstunde beg_hw_min dau_hw_min -0.485



Correlations between the timing and the duration of homework assignment,
separated by school type.

# filter for Grundschule and Einzelstunde
p_data_45_gs <- p_data %>%
  filter(Schulart == "Grundschule" & stunde == 45)

imp_45_gs <- mice(p_data_45_gs, 
                  maxit = maxit, 
                  m = m,
                  meth = meth,
                  pred = pred,
                  seed = 666,
                  printFlag = FALSE
                  )

# filter for Grundschule and Doppelstunde
p_data_90_gs <- p_data %>%
  filter(Schulart == "Grundschule" & stunde == 90) 

imp_90_gs <- mice(p_data_90_gs, 
                  maxit = maxit, 
                  m = m,
                  meth = meth,
                  pred = pred,
                  seed = 666,
                  printFlag = FALSE
                  )

# filter for Gymnasium and Einzelstunde
p_data_45_gy <- p_data %>%
  filter(Schulart == "Gymnasium" & stunde == 45)

imp_45_gy <- mice(p_data_45_gy, 
                  maxit = maxit, 
                  m = m,
                  meth = meth,
                  pred = pred,
                  seed = 666,
                  printFlag = FALSE
                  )

# filter for Gymnasium and Doppelstunde
p_data_90_gy <- p_data %>%
  filter(Schulart == "Gymnasium" & stunde == 90) 

imp_90_gy <- mice(p_data_90_gy, 
                  maxit = maxit, 
                  m = m,
                  meth = meth,
                  pred = pred,
                  seed = 666,
                  printFlag = FALSE
                  )

cor45_hw_gs <- miceadds::micombine.cor(imp_45_gs, variables = c("beg_hw_min", "dau_hw_min"))
cor45_hw_gs <- cor45_hw_gs %>%
    select(variable1, variable2, r)

cor90_hw_gs <- miceadds::micombine.cor(imp_90_gs, variables = c("beg_hw_min", "dau_hw_min"))
cor90_hw_gs <- cor90_hw_gs %>%
    select(variable1, variable2, r)

cor45_hw_gy <- miceadds::micombine.cor(imp_45_gy, variables = c("beg_hw_min", "dau_hw_min"))
cor45_hw_gy <- cor45_hw_gy %>%
    select(variable1, variable2, r)

cor90_hw_gy <- miceadds::micombine.cor(imp_90_gy, variables = c("beg_hw_min", "dau_hw_min"))
cor90_hw_gy <- cor90_hw_gy %>%
    select(variable1, variable2, r)

cor_hw_sa <- bind_rows(cor45_hw_gs[1,], cor90_hw_gs[1,], cor45_hw_gy[1,], cor90_hw_gy[1,])
cor_hw_sa$Stundenlaenge <- c("Grundschule Einzelstunde", "Grundschule Doppelstunde", "Gymnasium Einzelstunde", "Gymnasium Doppelstunde")
cor_hw_sa <- cor_hw_sa[,c("Stundenlaenge", "variable1", "variable2", "r")]
pander(cor_hw_sa)
Stundenlaenge variable1 variable2 r
Grundschule Einzelstunde beg_hw_min dau_hw_min -0.280
Grundschule Doppelstunde beg_hw_min dau_hw_min -0.507
Gymnasium Einzelstunde beg_hw_min dau_hw_min -0.317
Gymnasium Doppelstunde beg_hw_min dau_hw_min -0.336

Effect sizes

Unless already computed above.

Effect sizes of grade level (Klassenstufe)

For comparability with school type, the effect measures were computed as correlations.


Depending on the combination of the two variables' measurement levels, the appropriate correlation is computed:

  • Both variables metric: Pearson correlation
  • One variable metric, one variable dichotomous: point-biserial correlation (in this case mathematically equivalent to Pearson)
variable1 variable2 r
Klassenstufe lh_ank -0.216
Klassenstufe lh_auf -0.249
Klassenstufe lh_nen 0.006
Klassenstufe lh_sch -0.066
Klassenstufe lh_erl -0.041
Klassenstufe lh_wfr 0.015
Klassenstufe lh_bfr -0.037
Klassenstufe lh_wno -0.350
Klassenstufe lh_ano -0.367
Klassenstufe sh_auf -0.215
Klassenstufe sh_mel 0.000
Klassenstufe sh_fra -0.053
Klassenstufe sh_not -0.131
Klassenstufe beg_hw_min 0.215
Klassenstufe dau_hw_min -0.286
Klassenstufe beg_hw_min 0.318
Klassenstufe dau_hw_min -0.357

Effect sizes of school type (Schulart)

The effect sizes of school type on the dichotomous variables were already computed in the course of the Bayesian hypothesis tests (via Pearson's \(\varphi\)), since loops had to be programmed there as well. Here, the effect sizes on the metric variables are additionally computed via Pearson's point-biserial correlation \(r_{pb}\). For this purpose, the variable "Schulart" was recoded into a numeric variable, with Grundschule = 0 and Gymnasium = 1.
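A minimal sketch of this recoding (the variable names mirror the data set, but the values below are invented, not study data): once the factor is coded 0/1, the point-biserial correlation is simply Pearson's \(r\) from `cor()`.

```r
# Invented example values; only the coding scheme follows the analysis
Schulart   <- factor(c("Grundschule", "Gymnasium", "Gymnasium", "Grundschule",
                       "Gymnasium", "Grundschule", "Gymnasium", "Gymnasium"))
dau_hw_min <- c(2.5, 1.0, 1.5, 3.0, 0.5, 2.0, 1.0, 1.5)  # invented durations

Schulart_n <- ifelse(Schulart == "Gymnasium", 1, 0)  # Grundschule = 0, Gymnasium = 1
r_pb <- cor(Schulart_n, dau_hw_min)                  # point-biserial r
r_pb  # negative here: the invented Gymnasium durations are shorter
```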

variable1 variable2 r
Schulart_n lh_bfr -0.030
Schulart_n sh_auf -0.191
Schulart_n sh_mel -0.001
Schulart_n sh_fra -0.045
Schulart_n sh_not -0.149
Schulart_n beg_hw_min 0.216
Schulart_n dau_hw_min -0.252
Schulart_n beg_hw_min 0.240
Schulart_n dau_hw_min -0.386